Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-41481: [CI] Update how extra environment variables are specified for the integration test docker job #42009

Merged
merged 10 commits into from
Jul 16, 2024

Conversation

paleolimbot
Copy link
Member

@paleolimbot paleolimbot commented Jun 6, 2024

Rationale for this change

Currently, nanoarrow and Rust are not being included in the integration tests. The command issued by archery includes multiple environment variable definitions and the rightmost ones disable the extra environment variables.

https://github.com/apache/arrow/actions/runs/9397807525/job/25881776553#step:9:353

DEBUG:archery:Executing `['docker', 'run', '--rm', '-e', 'ARCHERY_DEFAULT_BRANCH=main', '-e', 'ARCHERY_INTEGRATION_WITH_NANOARROW=1', '-e', 'ARCHERY_INTEGRATION_WITH_RUST=1', '-e', 'ARCHERY_INTEGRATION_WITH_NANOARROW=0', '-e', 'ARCHERY_INTEGRATION_WITH_RUST=0', '-e', 'ARROW_CPP_EXE_PATH=/build/cpp/debug', '-e', 'ARROW_NANOARROW_PATH=/build/nanoarrow', '-e', 'ARROW_RUST_EXE_PATH=/build/rust/debug', '-e', 'CCACHE_COMPILERCHECK=content', '-e', 'CCACHE_COMPRESS=1', '-e', 'CCACHE_COMPRESSLEVEL=6', '-e', 'CCACHE_DIR=/ccache', '-e', 'CCACHE_MAXSIZE=1G', '-e', 'GITHUB_ACTIONS=true', '-v', '/home/runner/work/arrow/arrow:/arrow', '-v', '/home/runner/work/arrow/arrow/.docker/conda-ccache:/ccache', 'apache/arrow-dev:amd64-conda-integration', '/arrow/ci/scripts/integration_arrow_build.sh /arrow /build && /arrow/ci/scripts/integration_arrow.sh /arrow /build']`
# ...
+ /arrow/ci/scripts/rust_build.sh /arrow /build
=====================================================================
Not building the Rust implementation.
=====================================================================
+ /arrow/ci/scripts/nanoarrow_build.sh /arrow /build
=====================================================================
Not building nanoarrow
=====================================================================

What changes are included in this PR?

This PR updates how environment variables are specified such that the intended value is passed to the docker build.

Are these changes tested?

Yes

Are there any user-facing changes?

No

@paleolimbot paleolimbot marked this pull request as draft June 6, 2024 13:05
Copy link

github-actions bot commented Jun 6, 2024

⚠️ GitHub issue #41481 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added the awaiting committer review Awaiting committer review label Jun 6, 2024
@paleolimbot paleolimbot marked this pull request as ready for review June 6, 2024 16:21
Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, can we prefer -e specified by command line to default values in docker-compose.yml instead of this approach? Something like:

diff --git a/dev/archery/archery/docker/core.py b/dev/archery/archery/docker/core.py
index cb83106002..16e164562d 100644
--- a/dev/archery/archery/docker/core.py
+++ b/dev/archery/archery/docker/core.py
@@ -340,18 +340,8 @@ class DockerCompose(Command):
         service = self.config.get(service_name)
 
         args = []
-        if user is not None:
-            args.extend(['-u', user])
-
-        if env is not None:
-            for k, v in env.items():
-                args.extend(['-e', '{}={}'.format(k, v)])
-
-        if volumes is not None:
-            for volume in volumes:
-                args.extend(['--volume', volume])
-
-        if self.config.using_docker or service['need_gpu'] or resource_limit:
+        use_docker = self.config.using_docker or service['need_gpu'] or resource_limit
+        if use_docker:
             # use gpus, requires docker>=19.03
             if service['need_gpu']:
                 args.extend(['--gpus', 'all'])
@@ -396,6 +386,18 @@ class DockerCompose(Command):
             # name which we refer as image in general
             args.append(service['image'])
 
+        if user is not None:
+            args.extend(['-u', user])
+
+        if env is not None:
+            for k, v in env.items():
+                args.extend(['-e', '{}={}'.format(k, v)])
+
+        if volumes is not None:
+            for volume in volumes:
+                args.extend(['--volume', volume])
+
+        if use_docker:
             # add command from compose if it wasn't overridden
             if command is not None:
                 args.append(command)

ci/scripts/nanoarrow_build.sh Outdated Show resolved Hide resolved
ci/scripts/rust_build.sh Outdated Show resolved Hide resolved
@github-actions github-actions bot added awaiting changes Awaiting changes awaiting change review Awaiting change review and removed awaiting committer review Awaiting committer review awaiting changes Awaiting changes labels Jun 6, 2024
@paleolimbot
Copy link
Member Author

@kou Thank you for taking a look and apologies for taking a few days to circle back!

BTW, can we prefer -e specified by command line to default values in docker-compose.yml instead of this approach?

That is certainly what I would have expected; however, I am not sure that I feel confident modifying this part of the code. In any case, I think the change here that enables these specific jobs to run is a good one (everything pertaining to whether the tests do or do not run is controlled in the workflow file).

echo "The nanoarrow source is missing. Please clone the arrow-nanoarrow repository"
echo "to arrow/nanoarrow before running the integration tests:"
echo " git clone https://github.com/apache/arrow-nanoarrow.git path/to/arrow/nanoarrow"
echo "Not building nanoarrow because '${source_dir}' does not exist"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not fond of this, because it implies that a typo when invoking this script, or a missing checkout, will go unnoticed.

@@ -27,18 +27,21 @@ build_dir=${2}/nanoarrow
# integration tests. Testing of the nanoarrow implementation in normal CI is handled
# by github workflows in the arrow-nanoarrow repository.

# Include in the integration test by default if the source directory is present
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that the flag is still off by default in Archery:

@click.option('--with-nanoarrow', type=bool, default=False,
help='Include nanoarrow in integration tests',
envvar="ARCHERY_INTEGRATION_WITH_NANOARROW")

@@ -1748,8 +1748,6 @@ services:
volumes: *conda-volumes
environment:
<<: [*common, *ccache]
ARCHERY_INTEGRATION_WITH_NANOARROW: 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious. Should we mark it =1 here and let the script take the passed value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great suggestion! I think if that happens, then the script will error if somebody doesn't have a nanoarrow checkout (and so a local run of archery docker run ... will fail perhaps unexpectedly).

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels Jun 17, 2024
@paleolimbot
Copy link
Member Author

@vibhatha @pitrou Thank you for taking a look! I think the right thing to do is probably to run the integration tests from the nanoarrow repo where there are fewer arrow-related constraints (e.g., copy what's done in the arrow-rs repo).

The right thing to do here is to apply @kou's suggestion to fix the root cause (unexpected behaviour of archery docker run); however, it looks like this causes problems in other places (or perhaps I applied the suggestion incorrectly!).

@kou
Copy link
Member

kou commented Jun 18, 2024

Ah, sorry. My patch was wrong. How about this?

diff --git a/dev/archery/archery/docker/core.py b/dev/archery/archery/docker/core.py
index ec422e9aaf..5be4887ea4 100644
--- a/dev/archery/archery/docker/core.py
+++ b/dev/archery/archery/docker/core.py
@@ -383,10 +383,6 @@ class DockerCompose(Command):
                     args.append(f'--memory={memory}')
                     args.append(f'--memory-swap={memory}')
 
-            # get the actual docker image name instead of the compose service
-            # name which we refer as image in general
-            args.append(service['image'])
-
         if user is not None:
             args.extend(['-u', user])
 
@@ -399,6 +395,10 @@ class DockerCompose(Command):
                 args.extend(['--volume', volume])
 
         if use_docker:
+            # get the actual docker image name instead of the compose service
+            # name which we refer as image in general
+            args.append(service['image'])
+
             # add command from compose if it wasn't overridden
             if command is not None:
                 args.append(command)

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Jun 18, 2024
@paleolimbot paleolimbot force-pushed the ci-integration-rust-na branch 2 times, most recently from 1b938b8 to 932ccdc Compare June 24, 2024 13:51
@paleolimbot
Copy link
Member Author

@kou This seems to be working...is there any additional changes needed to this approach?

Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@kou kou merged commit 21238a7 into apache:main Jul 16, 2024
53 checks passed
@kou kou removed the awaiting change review Awaiting change review label Jul 16, 2024
@github-actions github-actions bot added the awaiting merge Awaiting merge label Jul 16, 2024
Copy link

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 21238a7.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 14 possible false positives for unstable benchmarks that are known to sometimes produce them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting merge Awaiting merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants